Santa Barbara
Jeff Bridges Is Digging It
The interior of Jeff Bridges's garage, in Santa Barbara, California, has the ramshackle ease of an extravagant dorm room: a tiger-print rug, a potter's wheel, guitars, a rogue toothbrush, taped-up printouts of ideas he finds provocative or perhaps grounding ("Enlightenment is a communal experience"), and piles of books, from Richard Powers's "Bewilderment" to "Who Cares?!" A black-and-white portrait of Captain Beefheart, incongruously dressed in a jacket and tie, hangs on a wall near an electric piano. When I arrived, on a recent afternoon, I did not take note of a lava lamp, but its presence didn't feel out of the question. Bridges was wearing rubber slides and a periwinkle-blue cardigan. He excitedly spread out a large furry blanket on a recliner and invited me to sit down: "Your throne, man!" he said.

Earlier this month, Bridges released "Slow Magic, 1977-1978," a series of songs he recorded when he was in his late twenties, an emergent movie star, and involved in a regular Wednesday-night jam session with a coterie of musicians and oddballs from the west side of Los Angeles (the jams were organized by Steve Baim, who attended University High School with Bridges; they took place in various beach houses and, occasionally, at the Village, the recording studio where, around the same time, Fleetwood Mac was making "Tusk").

"Slow Magic" is great and also bonkers. On "Kong," Bridges recounts a story line he pitched for a potential "King Kong" sequel (in 1976, Bridges starred as the long-haired primatologist Jack Prescott in a "Kong" remake produced by Dino De Laurentiis); the track features animated narration from the actor Burgess Meredith, and its lyrics are centered on the revelation that Kong is actually a robot. "It's a sad story, but he was just a monkey machine!" Bridges wails in a tottering falsetto. On "Obnoxious," a weirdly tender song about feeling sad and having a stomachache ("I went to the bathroom / And threw up"), there are echoes of Frank Zappa and the Band. What I like most about the record is how social it feels: friends in a room, being dumb, intermittently (even inadvertently) doing something miraculous. "When recording technology kept improving, I said, 'Oh, I don't need anybody!
Fragment-based Pretraining and Finetuning on Molecular Graphs
Property prediction on molecular graphs is an important application of Graph Neural Networks (GNNs). Recently, unlabeled molecular data has become abundant, which facilitates the rapid development of self-supervised learning for GNNs in the chemical domain. In this work, we propose pretraining GNNs at the fragment level, a promising middle ground to overcome the limitations of node-level and graph-level pretraining. Borrowing techniques from recent work on principal subgraph mining, we obtain a compact vocabulary of prevalent fragments from a large pretraining dataset. From the extracted vocabulary, we introduce several fragment-based contrastive and predictive pretraining tasks.
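To make the idea concrete, here is a minimal, hedged sketch of what one fragment-level contrastive task might look like. RDKit's BRICS decomposition stands in for the mined principal-subgraph vocabulary, and the encoders and loss pairing are illustrative assumptions, not the paper's exact tasks.

```python
# Toy fragment-level contrastive objective (illustrative; not the paper's exact method).
import torch
import torch.nn.functional as F
from rdkit import Chem
from rdkit.Chem import BRICS

def fragment_smiles(smiles: str) -> list:
    """Decompose a molecule into fragment SMILES (proxy for principal-subgraph mining)."""
    mol = Chem.MolFromSmiles(smiles)
    return sorted(BRICS.BRICSDecompose(mol)) if mol else []

def fragment_contrastive_loss(graph_emb: torch.Tensor, frag_emb: torch.Tensor, tau: float = 0.2) -> torch.Tensor:
    """NT-Xent-style loss pairing each molecule embedding with its own fragment-bag embedding."""
    g, f = F.normalize(graph_emb, dim=-1), F.normalize(frag_emb, dim=-1)
    logits = g @ f.T / tau                        # (batch, batch); positives on the diagonal
    return F.cross_entropy(logits, torch.arange(g.size(0)))

print(fragment_smiles("CC(=O)Oc1ccccc1C(=O)O"))   # aspirin -> a handful of BRICS fragments
# In pretraining, graph_emb would come from the GNN and frag_emb from an encoder over the
# molecule's fragment bag; random tensors here just exercise the loss.
print(fragment_contrastive_loss(torch.randn(4, 32), torch.randn(4, 32)))
```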
Gradient-guided Attention Map Editing: Towards Efficient Contextual Hallucination Mitigation
Wang, Yu, Zhang, Jiaxin, Gao, Xiang, Cui, Wendi, Li, Peng, Das, Kamalika
In tasks like summarization and open-book question answering (QA), Large Language Models (LLMs) often encounter "contextual hallucination", where they produce irrelevant or incorrect responses despite having access to accurate source information. This typically occurs because these models tend to prioritize self-generated content over the input context, causing them to disregard pertinent details. To address this challenge, we introduce a novel method called "Guided Attention Map Editing" (GAME), which dynamically adjusts attention maps to improve contextual relevance. During inference, GAME employs a trained classifier to identify attention maps prone to inducing hallucinations and executes targeted interventions. These interventions, guided by gradient-informed "edit directions", strategically redistribute attention weights across various heads to effectively reduce hallucination. Comprehensive evaluations on challenging summarization and open-book QA tasks show that GAME consistently reduces hallucinations across a variety of open-source models. Specifically, GAME reduces hallucinations by 10% in the XSum summarization task while achieving a 7X speed-up in computational efficiency compared to state-of-the-art baselines.
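The following is a hedged sketch of the inference-time intervention described above: a classifier scores how hallucination-prone an attention map looks, and the gradient of that score supplies an "edit direction" used to nudge the map before renormalizing. The scorer, the single-step edit rule, and all names here are assumptions for illustration, not the authors' implementation.

```python
import torch

def edit_attention_map(attn: torch.Tensor, hallucination_clf, step: float = 0.1) -> torch.Tensor:
    """Gradient-guided edit of one attention map of shape (heads, queries, keys).

    `hallucination_clf` is a hypothetical trained scorer: higher output means the map
    looks more hallucination-prone. Its gradient w.r.t. the map serves as the edit
    direction; we take one small step against it and renormalize each attention row.
    """
    attn = attn.detach().clone().requires_grad_(True)
    score = hallucination_clf(attn)                  # scalar hallucination score
    score.backward()
    with torch.no_grad():
        edited = (attn - step * attn.grad).clamp_min(0.0)
        edited = edited / edited.sum(dim=-1, keepdim=True).clamp_min(1e-9)  # rows sum to 1
    return edited

# Usage with a stand-in linear scorer (the real method trains a classifier on attention features).
clf = torch.nn.Sequential(torch.nn.Flatten(0), torch.nn.Linear(4 * 8 * 8, 1))
attn = torch.softmax(torch.randn(4, 8, 8), dim=-1)
edited = edit_attention_map(attn, lambda a: clf(a).squeeze())
print(edited.shape)
```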
Generalizable Machine Learning Models for Predicting Data Center Server Power, Efficiency, and Throughput
Lei, Nuoa, Shehabi, Arman, Lu, Jun, Cao, Zhi, Koomey, Jonathan, Smith, Sarah, Masanet, Eric
In the rapidly evolving digital era, comprehending the intricate dynamics influencing server power consumption, efficiency, and performance is crucial for sustainable data center operations. However, existing models lack the ability to provide a detailed and reliable understanding of these intricate relationships. This study employs a machine learning-based approach, using the SPECPower_ssj2008 database, to facilitate user-friendly and generalizable server modeling. The resulting models demonstrate high accuracy, with errors falling within approximately 10% on the testing dataset, showcasing their practical utility and generalizability. Through meticulous analysis, predictive features related to hardware availability date, server workload level, and specifications are identified, providing insights into optimizing energy conservation, efficiency, and performance in server deployment and operation. By systematically measuring biases and uncertainties, the study underscores the need for caution when employing historical data for prospective server modeling, considering the dynamic nature of technology landscapes. Collectively, this work offers valuable insights into the sustainable deployment and operation of servers in data centers, paving the way for enhanced resource use efficiency and more environmentally conscious practices.
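A minimal sketch of the kind of generalizable server-power regression the study describes, using gradient-boosted trees on hypothetical SPECpower-style features (availability year, workload level, core count, memory). The synthetic data, feature names, and model choice are assumptions for illustration; the study's actual pipeline and feature selection on the SPECPower_ssj2008 database are more involved.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_percentage_error
from sklearn.model_selection import train_test_split

# Hypothetical SPECpower_ssj2008-style records; real features and targets come from the database.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "availability_year": rng.integers(2008, 2023, 500),
    "workload_pct":      rng.choice(np.arange(10, 101, 10), 500),
    "cores":             rng.choice([8, 16, 32, 64], 500),
    "memory_gb":         rng.choice([32, 64, 128, 256], 500),
})
df["power_w"] = 40 + 2.5 * df.cores + 0.9 * df.workload_pct + rng.normal(0, 10, 500)  # toy target

X, y = df.drop(columns="power_w"), df["power_w"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
# The study reports errors within roughly 10% on its real test set; this toy setup only shows the workflow.
print(f"test MAPE: {mean_absolute_percentage_error(y_te, model.predict(X_te)):.1%}")
```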
From superposition to sparse codes: interpretable representations in neural networks
Klindt, David, O'Neill, Charles, Reizinger, Patrik, Maurer, Harald, Miolane, Nina
Understanding how information is represented in neural networks is a fundamental challenge in both neuroscience and artificial intelligence. Despite their nonlinear architectures, recent evidence suggests that neural networks encode features in superposition, meaning that input concepts are linearly overlaid within the network's representations. We present a perspective that explains this phenomenon and provides a foundation for extracting interpretable representations from neural activations. Our theoretical framework consists of three steps: (1) Identifiability theory shows that neural networks trained for classification recover latent features up to a linear transformation. (2) Sparse coding methods can extract disentangled features from these representations by leveraging principles from compressed sensing. (3) Quantitative interpretability metrics provide a means to assess the success of these methods, ensuring that extracted features align with human-interpretable concepts. By bridging insights from theoretical neuroscience, representation learning, and interpretability research, we propose an emerging perspective on understanding neural representations in both artificial and biological systems. Our arguments have implications for neural coding theories, AI transparency, and the broader goal of making deep learning models more interpretable.
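As a small illustration of step (2), the sketch below recovers sparse codes from synthetic "superposed" activations with off-the-shelf dictionary learning. In practice, sparse autoencoders or other compressed-sensing-style decoders play this role; the dimensions, sparsity level, and hyperparameters here are arbitrary assumptions.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)

# Synthetic superposition: 50 sparse ground-truth concepts linearly overlaid into 20 activation dims.
n_samples, n_concepts, n_dims = 1000, 50, 20
mixing = rng.normal(size=(n_concepts, n_dims))
codes_true = rng.binomial(1, 0.05, size=(n_samples, n_concepts)) * rng.normal(1.0, 0.1, (n_samples, n_concepts))
activations = codes_true @ mixing                      # what the network would expose

# Sparse coding: learn an overcomplete dictionary and sparse codes from the activations alone.
dl = DictionaryLearning(n_components=n_concepts, alpha=0.5, max_iter=200, random_state=0)
codes_hat = dl.fit_transform(activations)

print("mean nonzeros per sample:", (np.abs(codes_hat) > 1e-6).sum(axis=1).mean())
```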
Analysis of Linear Consensus Algorithm on Strongly Connected Graph Using Effective Resistance
Yonaiyama, Takumi, Sato, Kazuhiro
We study the performance of the linear consensus algorithm on strongly connected graphs using the linear quadratic (LQ) cost as a performance measure. In particular, we derive bounds on the LQ cost by leveraging effective resistance. Our results extend previous analyses -- which were limited to reversible cases -- to the nonreversible setting. To facilitate this generalization, we introduce novel concepts, termed the back-and-forth path and the pivot node, which serve as effective alternatives to traditional techniques that require reversibility. Moreover, we apply our approach to geometric graphs to estimate the LQ cost without the reversibility assumption. The proposed approach provides a framework that can be adapted to other contexts where reversibility is typically assumed.
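A hedged numerical sketch of the setting: the linear consensus dynamics on a directed cycle (strongly connected but nonreversible), with an LQ-style cost approximated as the integral of squared deviation from the instantaneous average state. The paper's exact cost definition and its effective-resistance bounds are not reproduced here.

```python
import numpy as np
from scipy.integrate import solve_ivp, trapezoid

# Directed cycle on n nodes: strongly connected, nonreversible.
n = 8
A = np.roll(np.eye(n), 1, axis=1)              # edge i -> i+1 (mod n)
L = np.diag(A.sum(axis=1)) - A                 # out-degree Laplacian

def deviation_cost(x0, T=50.0):
    """Approximate an LQ-style cost: integral over [0, T] of the squared deviation
    from the instantaneous average, under xdot = -L x (a sketch, not the paper's exact cost)."""
    sol = solve_ivp(lambda t, x: -L @ x, (0.0, T), x0, dense_output=True, rtol=1e-8)
    ts = np.linspace(0.0, T, 2000)
    xs = sol.sol(ts)                            # shape (n, len(ts))
    dev = xs - xs.mean(axis=0, keepdims=True)
    return trapezoid((dev ** 2).sum(axis=0), ts)

rng = np.random.default_rng(0)
print("approximate LQ cost from a random start:", deviation_cost(rng.normal(size=n)))
```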
LLM-USO: Large Language Model-based Universal Sizing Optimizer
S, Karthik Somayaji N., Li, Peng
The design of analog circuits is a cornerstone of integrated circuit (IC) development, requiring the optimization of complex, interconnected sub-structures such as amplifiers, comparators, and buffers. Traditionally, this process relies heavily on expert human knowledge to refine design objectives by carefully tuning sub-components while accounting for their interdependencies. Existing methods, such as Bayesian Optimization (BO), offer a mathematically driven approach for efficiently navigating large design spaces. However, these methods fall short in two critical areas compared to human expertise: (i) they lack the semantic understanding of the sizing solution space and its direct correlation with design objectives before optimization, and (ii) they fail to reuse knowledge gained from optimizing similar sub-structures across different circuits. To overcome these limitations, we propose the Large Language Model-based Universal Sizing Optimizer (LLM-USO), which introduces a novel method for knowledge representation to encode circuit design knowledge in a structured text format. This representation enables the systematic reuse of optimization insights for circuits with similar sub-structures. LLM-USO employs a hybrid framework that integrates BO with large language models (LLMs) and a learning summary module. This approach serves to: (i) infuse domain-specific knowledge into the BO process and (ii) facilitate knowledge transfer across circuits, mirroring the cognitive strategies of expert designers. Specifically, LLM-USO constructs a knowledge summary mechanism to distill and apply design insights from one circuit to related ones. It also incorporates a knowledge summary critiquing mechanism to ensure the accuracy and quality of the summaries and employs BO-guided suggestion filtering to identify optimal design points efficiently.
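A rough sketch of the hybrid loop described above: a Gaussian-process surrogate proposes sizing candidates, and an LLM-produced knowledge summary (here just a placeholder string and a stub filter) would bias which candidates get evaluated. The simulator, the LLM call, the summary format, and the filtering rule are all assumptions; the paper's knowledge-representation, critiquing, and suggestion-filtering mechanisms are considerably more involved.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def simulate_circuit(sizing: np.ndarray) -> float:
    """Stand-in for a SPICE evaluation of a figure of merit (lower is better)."""
    return float(((sizing - 0.3) ** 2).sum())

def llm_filter(candidates: np.ndarray, knowledge_summary: str) -> np.ndarray:
    """Hypothetical LLM-guided filter; here a stub that keeps all candidates."""
    return candidates

knowledge_summary = "Placeholder insight distilled from a related circuit's optimization run."

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(5, 3))                       # initial sizings (3 design variables)
y = np.array([simulate_circuit(x) for x in X])

for _ in range(20):                                       # BO loop with LLM-informed filtering
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
    cand = llm_filter(rng.uniform(0, 1, size=(256, 3)), knowledge_summary)
    mu, sigma = gp.predict(cand, return_std=True)
    x_next = cand[np.argmin(mu - sigma)]                  # simple lower-confidence-bound acquisition
    X = np.vstack([X, x_next])
    y = np.append(y, simulate_circuit(x_next))

print("best figure of merit found:", y.min())
```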
Evaluating Deep Human-in-the-Loop Optimization for Retinal Implants Using Sighted Participants
Schoinas, Eirini, Rastogi, Adyah, Carter, Anissa, Granley, Jacob, Beyeler, Michael
Human-in-the-loop optimization (HILO) is a promising approach for personalizing visual prostheses by iteratively refining stimulus parameters based on user feedback. Previous work demonstrated HILO's efficacy in simulation, but its performance with human participants remains untested. Here we evaluate HILO using sighted participants viewing simulated prosthetic vision to assess its ability to optimize stimulation strategies under realistic conditions. Participants selected between phosphenes generated by competing encoders to iteratively refine a deep stimulus encoder (DSE). We tested HILO in three conditions: standard optimization, threshold misspecifications, and out-of-distribution parameter sampling. Participants consistently preferred HILO-generated stimuli over both a naïve encoder and the DSE alone, with log odds favoring HILO across all conditions. We also observed key differences between human and simulated decision-making, highlighting the importance of validating optimization strategies with human participants. These findings support HILO as a viable approach for adapting visual prostheses to individuals.
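A toy sketch of the preference loop the abstract describes: two candidate encoder settings are presented, the participant picks one, and the estimate moves toward the winner. The actual work uses preferential Bayesian optimization over a deep stimulus encoder; the simulated participant and the simple weighted update below are only illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
true_best = np.array([0.6, -0.2, 0.8])         # unknown "ideal" encoder parameters (toy)

def participant_prefers(a: np.ndarray, b: np.ndarray) -> bool:
    """Simulated participant: noisily prefers whichever candidate is closer to true_best."""
    return np.linalg.norm(a - true_best) + rng.normal(0, 0.05) < np.linalg.norm(b - true_best)

estimate = np.zeros(3)                          # current belief about good encoder parameters
step = 0.3
for trial in range(100):
    a = estimate + rng.normal(0, 0.5, 3)        # two perturbed candidates around the estimate
    b = estimate + rng.normal(0, 0.5, 3)
    winner = a if participant_prefers(a, b) else b
    estimate += step * (winner - estimate)      # move toward the preferred candidate
    step *= 0.98                                # anneal the step size

print("final parameter error:", np.linalg.norm(estimate - true_best))
```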
GraphMinNet: Learning Dependencies in Graphs with Light Complexity Minimal Architecture
Ahamed, Md Atik, Cheng, Andrew, Ye, Qiang, Cheng, Qiang
Graph Neural Networks (GNNs) have demonstrated remarkable success in various applications, yet they often struggle to capture long-range dependencies (LRD) effectively. This paper introduces GraphMinNet, a novel GNN architecture that generalizes the idea of minimal Gated Recurrent Units to graph-structured data. Our approach achieves efficient LRD modeling with linear computational complexity while maintaining permutation equivariance and stability. The model incorporates both structural and positional information through a unique combination of feature and positional encodings, leading to provably stronger expressiveness than the 1-WL test. Theoretical analysis establishes that GraphMinNet maintains non-decaying gradients over long distances, ensuring effective long-range information propagation. Extensive experiments on ten diverse datasets, including molecular graphs, image graphs, and synthetic networks, demonstrate that GraphMinNet achieves state-of-the-art performance while being computationally efficient. Our results show superior performance on 6 out of 10 datasets and competitive results on the others, validating the effectiveness of our approach in capturing both local and global graph structures.
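Below is a hedged PyTorch sketch of the core idea: a minimal-GRU-style gated update generalized to graph data, with candidate states computed from mean-aggregated neighbor features. Layer sizes and the aggregation rule are illustrative; the paper's positional encodings, stability guarantees, and linear-complexity long-range mechanism are not reproduced here.

```python
import torch
import torch.nn as nn

class MinimalGatedGraphLayer(nn.Module):
    """Toy gated graph layer: h' = (1 - z) * h + z * h_tilde, minGRU-style,
    with candidate states built from mean-aggregated neighbor features."""
    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Linear(dim, dim)
        self.cand = nn.Linear(2 * dim, dim)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        deg = adj.sum(dim=-1, keepdim=True).clamp_min(1.0)
        agg = adj @ h / deg                               # mean over neighbors
        z = torch.sigmoid(self.gate(h))                   # gate depends only on the input, as in minGRU
        h_tilde = torch.tanh(self.cand(torch.cat([h, agg], dim=-1)))
        return (1.0 - z) * h + z * h_tilde

# Usage on a random undirected graph with 6 nodes and 16-dimensional node features.
n, d = 6, 16
adj = (torch.rand(n, n) < 0.4).float()
adj = ((adj + adj.T) > 0).float().fill_diagonal_(0)
layer = MinimalGatedGraphLayer(d)
out = layer(torch.randn(n, d), adj)
print(out.shape)  # torch.Size([6, 16])
```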